Outline

Background

Multiple trait selection

Multi-trait model

Background

  • Breeders select for multiple traits (morphological, physiological, biotic resistance, etc.)

  • The list of traits can be endless, and each trait has a different level of importance

  • Program goals, selection cycle, breeding (or generation) cycle

 

Fig 1. Model or ideal plant (IDEOTYPE) with attributes combined to maximize yield
(Furbank et al. 2019).

Correlation between traits

  • Phenotypic correlation: linear association between the phenotypic values of two traits.

  • Genetic correlation: linear association between the genetic (breeding) values of two traits.

  • Environmental correlation: linear association between the non-additive genetic and nongenetic effects of two traits.

\[r_{P} = \frac {{COV}_{Pxy}}{{\sigma}_{Px}{\sigma}_{Py}}\]

\[{{COV}_{Pxy}} = r_{P} ({\sigma}_{Px}{\sigma}_{Py})\]

\[{{COV}_{Pxy}} = {{COV}_{Gxy}} + {{COV}_{Exy}}\]

\[\begin{align} r_{P} ({\sigma}_{Px}{\sigma}_{Py}) &= r_{G} ({\sigma}_{Gx}{\sigma}_{Gy}) + r_{E} ({\sigma}_{Ex}{\sigma}_{Ey}) \\ & \vdots \\ r_{P} &= {h}_{x}{h}_{y}r_{G} + {e}_{x}{e}_{y}r_{E} \\ \end{align}\]

  • Phenotypic correlation results from the combination of genetic and environmental correlations.

  • If both traits have low heritabilities, then the phenotypic correlation is mainly due to the environmental correlation.
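
The decomposition above can be checked numerically. The sketch below is only an illustration: the function name `phenotypic_cor` and all input values are hypothetical, and it takes \(e = \sqrt{1 - h^2}\) for each trait.

```r
# Phenotypic correlation as the sum of a genetic and an environmental term:
# r_P = h_x * h_y * r_G + e_x * e_y * r_E, with e = sqrt(1 - h^2)
phenotypic_cor <- function(h2_x, h2_y, r_g, r_e) {
  h_x <- sqrt(h2_x)
  h_y <- sqrt(h2_y)
  e_x <- sqrt(1 - h2_x)
  e_y <- sqrt(1 - h2_y)
  h_x * h_y * r_g + e_x * e_y * r_e
}

# Hypothetical case 1: both traits have low heritability (h2 = 0.10)
r_p_low <- phenotypic_cor(h2_x = 0.10, h2_y = 0.10, r_g = 0.8, r_e = 0.2)

# Hypothetical case 2: same correlations, but high heritability (h2 = 0.80)
r_p_high <- phenotypic_cor(h2_x = 0.80, h2_y = 0.80, r_g = 0.8, r_e = 0.2)

print(c(low_h2 = r_p_low, high_h2 = r_p_high))
```

In the low-heritability case the genetic term contributes 0.08 and the environmental term 0.18, so the phenotypic correlation mostly reflects the environmental correlation even though the genetic correlation is high.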

Genetic correlation

  • Greatest interest for breeders

  • Positive or negative, and favorable or unfavorable, depending on the breeding goal. Ex 1: plant height and flowering date in barley are positively correlated. This correlation is favorable if grain harvest is the goal (short, early-maturing plants), but unfavorable for forage (tall, late-maturing plants) (Bernardo et al. 2002).

  • Causes

    • Linkage: genes located on the same chromosome are genetically linked and, unlike genes located on different chromosomes, do not segregate independently (the resulting correlation can be dissipated by cycles of meiosis).

    • Pleiotropy: two traits are controlled by the same gene (the resulting correlation cannot be dissipated and is more permanent).

Response to selection

The mean genetic change in the trait of interest in a population.

Direct response to selection

The selection criterion is based on the trait(s) of interest in the target environment.

\[R_{x} = {k}_{p} {h}_{x}{\sigma}_{a_{x}}\] \[R_{y} = {k}_{p} {h}_{y}{\sigma}_{a_{y}}\]

\({k}_{p}\): the standardized selection differential (selection intensity) when the proportion of selected individuals is \(p\);
\({h}\): the accuracy of individual (direct) selection, i.e. the square root of the heritability;
\({\sigma}_{a}\): the standard deviation of breeding values.
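
As a rough numerical illustration of the direct response, the sketch below computes \(k_p\) and \(R\) in R. It assumes truncation selection on a normally distributed criterion, in which case \(k_p\) is the normal density at the truncation point divided by \(p\). The function names and input values are hypothetical.

```r
# Standardized selection differential for a selected proportion p,
# assuming truncation selection on a normally distributed criterion
selection_intensity <- function(p) {
  dnorm(qnorm(1 - p)) / p
}

# Direct response to selection: R = k_p * h * sigma_a
direct_response <- function(p, h2, sigma_a) {
  selection_intensity(p) * sqrt(h2) * sigma_a
}

# Hypothetical example: select the best 10%, h2 = 0.40, sigma_a = 2 trait units
k_10 <- selection_intensity(0.10)
R_x <- direct_response(p = 0.10, h2 = 0.40, sigma_a = 2)

print(c(k_p = k_10, response = R_x))
```

Selecting a smaller proportion raises \(k_p\), but the response still scales with the accuracy \(h\) and the genetic standard deviation.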

Indirect selection

  • The selection criterion is based on trait(s) that may be associated with the trait(s) of interest (secondary traits). Ex: number of stalks as a secondary trait for tons of cane per hectare. The same trait measured in a different environment can also act as a secondary trait (Rutkoski 2019):

Fig 2. Yield of maize in tropical environment

 

Fig 3. Yield of maize in temperate environment

Correlated response to selection

  • Correlated response to selection: selection for one trait will cause a correlated response in the other (if genetic correlation exists). The expected response for a trait Y, when selection is applied to another trait X:

\[CR_{Y} = {k}_{p} {h}_{x} {h}_{y}{r}_{g}\sigma_{p_{Y}}\]

  • Indirect selection is expected to be better than direct selection if the secondary trait has substantially higher heritability, and the genetic correlation is high.

  • Indirect selection can also be advantageous if it allows much larger population sizes than direct selection, or if the secondary trait is easier and/or cheaper to evaluate or can be measured in both sexes.
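
The comparison between indirect and direct selection can be sketched numerically. The code below assumes the same selection intensity for both strategies and uses \(\sigma_{a_Y} = h_y \sigma_{p_Y}\). The function names and all values are hypothetical.

```r
# Correlated response in Y when selecting on X:
# CR_Y = k_p * h_x * h_y * r_g * sigma_p_y
correlated_response <- function(k, h2_x, h2_y, r_g, sigma_p_y) {
  k * sqrt(h2_x) * sqrt(h2_y) * r_g * sigma_p_y
}

# Direct response in Y: R_Y = k_p * h_y * sigma_a_y, with sigma_a_y = h_y * sigma_p_y
direct_response_y <- function(k, h2_y, sigma_p_y) {
  k * h2_y * sigma_p_y
}

# Hypothetical scenario: secondary trait X is much more heritable than target Y
k <- 1.755          # intensity for roughly 10% selected
sigma_p_y <- 10     # phenotypic standard deviation of Y

cr_y <- correlated_response(k, h2_x = 0.6, h2_y = 0.2, r_g = 0.8, sigma_p_y)
r_y <- direct_response_y(k, h2_y = 0.2, sigma_p_y)

# With equal intensity the ratio reduces to r_g * h_x / h_y
relative_efficiency <- cr_y / r_y
print(relative_efficiency)
```

A ratio above 1 means indirect selection is expected to outperform direct selection under these assumed values. With equal heritabilities the ratio equals \(r_g\) and cannot exceed 1.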

Multiple trait selection

Examples of multiple-trait selection strategies breeders employ…

Tandem selection

Selection for one trait until that trait is improved, then for a second, etc., until each has been improved to the desired level. Applicable in recurrent selection programs. Ex: tropical maize populations used as breeding material in temperate regions are first selected for photoperiod insensitivity prior to selection for other traits.

set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)

dat <- data.frame(x,y)

# Scatter plot of traits A and B; individuals kept when selecting on Trait B (y > 5) are red.
# panel.first draws the grey background after the axes are set up and before the points.
plot(x, y, pch = 19, xlim = range(dat$x), ylim = range(dat$y),
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B", col = ifelse(y > 5, "red", "blue"))
title(main = "SELECT FOR TRAIT B FOR SEVERAL CYCLES", cex.main = 2.5, col.main = "darkgreen")
abline(h = 5, col = "black", lwd = 2, lty = 2)  # selection threshold for Trait B

set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)

dat <- data.frame(x,y)

# Scatter plot of traits A and B; individuals kept when selecting on Trait A (x > 10) are red.
# panel.first draws the grey background after the axes are set up and before the points.
plot(x, y, pch = 19, xlim = range(dat$x), ylim = range(dat$y),
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B", col = ifelse(x > 10, "red", "blue"))
title(main = "SELECT FOR TRAIT A FOR SEVERAL CYCLES", cex.main = 2.5, col.main = "darkgreen")
abline(v = 10, col = "black", lwd = 2, lty = 2)  # selection threshold for Trait A

Independent culling levels

A certain level of merit (minimum level of performance) is established for each trait, and all individuals below that level are discarded regardless of their values for other traits.

set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)

dat <- data.frame(x,y)

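# Scatter plot of traits A and B; only individuals above both culling levels
# (x > 10 and y > 5) are red.
# panel.first draws the grey background after the axes are set up and before the points.
plot(x, y, pch = 19, xlim = range(dat$x), ylim = range(dat$y),
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B", col = ifelse(x > 10 & y > 5, "red", "blue"))
abline(v = 10, h = 5, col = "black", lwd = 2, lty = 2)  # culling levels for both traits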

Index selection

Select for \(n\) traits simultaneously by using some index of net merit:

\[I = b_{1}X_{1} + b_{2}X_{2} + ... + b_{n}X_{n}\] \(b_{i}\) is the weight for trait \(i\) and \(X_{i}\) is the phenotypic value for trait \(i\).
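
A minimal sketch of index selection on simulated data is shown next. The two traits are drawn independently, as in the plots of the other strategies. The weights in `b` and the 20% selected proportion are hypothetical choices made only for illustration.

```r
set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)  # Trait A
y <- rnorm(200, mean = 6, sd = 2)   # Trait B
dat <- data.frame(x, y)

# Hypothetical index weights (one per trait)
b <- c(A = 1, B = 2)

# Index of net merit: I = b_1 * X_1 + b_2 * X_2
dat$index <- b[["A"]] * dat$x + b[["B"]] * dat$y

# Keep the top 20% of individuals ranked by index value
n_sel <- ceiling(0.20 * nrow(dat))
selected <- dat[order(dat$index, decreasing = TRUE)[seq_len(n_sel)], ]

head(selected)
```

Unlike independent culling, an individual with a modest value for one trait can still be selected if its value for the other trait is high enough to compensate.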

Example adapted from Bernardo et al. (2002):
Price of a 60 kg bag of maize: R$ 90.00
Cost to dry a 60 kg bag of maize: R$ 0.25 per percentage point of moisture above the target
Target moisture concentration: 13%

A new candidate maize hybrid yields 100 bags (60 kg each) per hectare at 15% moisture. Its profit can then be expressed in the form of the following selection index:

\[I = (90)(Yield) - (0.25)(Moisture - 13\%)(Yield)\]
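
Plugging the candidate hybrid into this index gives its expected profit per hectare. The sketch below is a direct translation of the formula. The function and variable names are hypothetical, and the drying cost is assumed to apply per bag and per percentage point of moisture above the target.

```r
price_per_bag <- 90    # R$ per 60 kg bag
drying_cost <- 0.25    # R$ per bag and percentage point above target (assumed unit)
target_moisture <- 13  # %

# Profit index: I = price * yield - drying_cost * (moisture - target) * yield
profit_index <- function(yield, moisture) {
  price_per_bag * yield - drying_cost * (moisture - target_moisture) * yield
}

# Candidate hybrid: 100 bags/ha at 15% moisture
index_candidate <- profit_index(yield = 100, moisture = 15)
print(index_candidate)  # 9000 - 50 = 8950
```

Note that, as written, the formula rewards hybrids harvested below 13% moisture. Whether that is desired depends on the program.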

Uni-trait model

The BLUP (best linear unbiased prediction) methodology uses mixed models for genetic analysis and provides accurate, unbiased predictions of breeding values.

Mixed model formulation

\[ y = \mathbf{X}b + \mathbf{Z}u + e \]

\(y\): an (n x 1) vector of observations (phenotypes)
\(b\): a (p x 1) vector of fixed effects (e.g. environment)
\(u\): a (q x 1) vector of random breeding values; \(u \sim N(\mathbf{0},\mathbf{K} \sigma_{g}^2)\)
\(e\): an (n x 1) vector of random residuals; \(e \sim N(\mathbf{0},\mathbf{R} \sigma_{e}^2)\)
\(\mathbf{X}\) and \(\mathbf{Z}\): design matrices relating the observations to the fixed and random effects, respectively.
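
To make the notation concrete, the sketch below builds \(\mathbf{X}\), \(\mathbf{Z}\) and \(\mathbf{K}\) for a toy data set and solves Henderson's mixed model equations in base R. It assumes \(\mathbf{R} = \mathbf{I}\) and treats the variance components as known. All data values, the relationship matrix and the variances are hypothetical. In practice the variance components would be estimated (e.g. by REML) with a mixed-model package.

```r
# Toy data: 5 observations on 3 genotypes in 2 environments (hypothetical)
y <- c(10, 12, 11, 14, 13)
env <- factor(c("E1", "E1", "E1", "E2", "E2"))
geno <- factor(c("G1", "G2", "G3", "G1", "G2"))

X <- model.matrix(~ env - 1)   # n x p design matrix for fixed effects
Z <- model.matrix(~ geno - 1)  # n x q design matrix for breeding values

# Hypothetical relationship matrix: G1 and G2 are related, G3 is unrelated
K <- matrix(c(1.00, 0.25, 0,
              0.25, 1.00, 0,
              0,    0,    1), nrow = 3, byrow = TRUE)

# Variance components assumed known for this sketch
sigma2_g <- 2
sigma2_e <- 4
lambda <- sigma2_e / sigma2_g

# Mixed model equations (with R = I):
# [X'X  X'Z               ] [b]   [X'y]
# [Z'X  Z'Z + K^-1 lambda ] [u] = [Z'y]
lhs <- rbind(cbind(crossprod(X), crossprod(X, Z)),
             cbind(crossprod(Z, X), crossprod(Z) + solve(K) * lambda))
rhs <- rbind(crossprod(X, y), crossprod(Z, y))
sol <- solve(lhs, rhs)

b_hat <- sol[seq_len(ncol(X)), 1]   # estimates of fixed effects (BLUE)
u_hat <- sol[-seq_len(ncol(X)), 1]  # predicted breeding values (BLUP)

print(b_hat)
print(u_hat)
```

Genotype G3 has a single record, but its prediction is still shrunk toward zero by \(\lambda\). G1 and G2 additionally borrow information from each other through the off-diagonal of \(\mathbf{K}\).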

Uni-trait model

\[ y = \mathbf{X}b + \mathbf{Z}u + e \] \[\begin{align} \begin{bmatrix} u \\ e \\ \end{bmatrix} \sim MVN \begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ \end{bmatrix} , \begin{bmatrix} \sigma_{g}^2 \mathbf{K} & 0 \\ 0 & \sigma_{e}^2 \mathbf{R} \\ \end{bmatrix} \end{pmatrix} \end{align}\]

Advantages

  • Allows the analysis of unbalanced data

  • Exploits information from relatives

Furthermore, it can be extended to incorporate/account for more effects, such as:

  • Rows, columns, environments, permanent environmental…

  • Interaction terms (e.g. G×E)

  • Covariance and correlation structures (relationship matrices and correlation matrices)

  • Heterogeneous variances

  • Correlated traits

Multi-trait model

\[ y_{i} = X_{i}b_{i} + Z_{i}u_{i} + e_{i}, \quad i = 1, 2, \ldots \ (\text{trait}) \]

\[\begin{align} \begin{bmatrix} y_{1} \\ y_{2} \\ \end{bmatrix} = \begin{bmatrix} X_{1} & 0 \\ 0 & X_{2} \\ \end{bmatrix} \begin{bmatrix} b_{1} \\ b_{2} \\ \end{bmatrix} + \begin{bmatrix} Z_{1} & 0 \\ 0 & Z_{2} \\ \end{bmatrix}\begin{bmatrix} u_{1} \\ u_{2} \\ \end{bmatrix} + \begin{bmatrix} e_{1} \\ e_{2} \\ \end{bmatrix} \end{align}\]

\[\begin{align} \begin{bmatrix} u \\ e \\ \end{bmatrix} \sim MVN \begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ \end{bmatrix} , \begin{bmatrix} \mathbf{G}\otimes\mathbf{K} & 0 \\ 0 & \mathbf{\Sigma}\otimes\mathbf{I} \\ \end{bmatrix} \end{pmatrix} \end{align}\]

\[\begin{align} \mathbf{G} = \left[ {\begin{array}{cc} \sigma_{g_{1}}^2 & \sigma_{g_{12}} \\ \sigma_{g_{21}} & \sigma_{g_{2}}^2 \\ \end{array} } \right] \end{align}\]

\[\begin{align} \mathbf{\Sigma} = \left[ {\begin{array}{cc} \sigma_{e_{1}}^2 & \sigma_{e_{12}} \\ \sigma_{e_{21}} & \sigma_{e_{2}}^2 \\ \end{array} } \right] \end{align}\]
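
The Kronecker structure of the genetic covariance can be illustrated with base R. In the sketch below the entries of `G` and `K` are hypothetical. The diagonal of `G` holds the genetic variance of each trait and the off-diagonal holds the genetic covariance between them.

```r
# Genetic (co)variance matrix between two traits (hypothetical values)
G <- matrix(c(4.0, 1.5,
              1.5, 2.0), nrow = 2, byrow = TRUE,
            dimnames = list(c("trait1", "trait2"), c("trait1", "trait2")))

# Relationship matrix among three individuals (hypothetical values)
K <- matrix(c(1.0, 0.5, 0.0,
              0.5, 1.0, 0.0,
              0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)

# Covariance of the stacked breeding values u = (u_1', u_2')'
var_u <- kronecker(G, K)
print(dim(var_u))  # (2 traits x 3 individuals) squared

# Genetic correlation between the two traits implied by G
r_g <- cov2cor(G)["trait1", "trait2"]
print(r_g)
```

The off-diagonal block of `var_u` equals the genetic covariance times `K`. This block is how records on one trait inform the breeding values of the other trait, both within an individual and across relatives.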

Advantages

  • Multi-trait models can increase the prediction accuracy for traits that have low heritability or are difficult and/or expensive to measure, provided the correlation between traits is at least moderate. This can be used to improve the target trait alone or all the correlated traits simultaneously (Calus and Veerkamp 2011).

  • Multi-trait models can be useful for increasing prediction accuracy when the traits of interest are not measured in the individuals of the testing set, but were observed in individuals in the training set (Jia and Jannink 2012).

Hands-on using R

 Installing packages
 
 # GitHub (development) version
 install.packages('devtools')
 library(devtools)
 install_github('covaruber/sommer')
 
 # or
 
 # CRAN version
 install.packages('sommer', dependencies = TRUE)
 library(sommer)

References

Bernardo, Rex et al. 2002. Breeding for Quantitative Traits in Plants. Vol. 1. Stemma Press, Woodbury, MN.
Calus, Mario PL, and Roel F Veerkamp. 2011. “Accuracy of Multi-Trait Genomic Selection Using Different Methods.” Genetics Selection Evolution 43: 1–14. https://gsejournal.biomedcentral.com/articles/10.1186/1297-9686-43-26.
Falconer, Douglas Scott. 1996. Introduction to Quantitative Genetics. Pearson Education India.
Furbank, Robert T, Jose A Jimenez-Berni, Barbara George-Jaeggli, Andries B Potgieter, and David M Deery. 2019. “Field Crop Phenomics: Enabling Breeding for Radiation Use Efficiency and Biomass in Cereal Crops.” New Phytologist 223 (4): 1714–27. https://nph.onlinelibrary.wiley.com/doi/full/10.1111/nph.15817.
Jia, Yi, and Jean-Luc Jannink. 2012. “Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy.” Genetics 192 (4): 1513–22. https://pubmed.ncbi.nlm.nih.gov/23086217/.
Montesinos López, Osval Antonio, Abelardo Montesinos López, and José Crossa. 2022. Multivariate Statistical Machine Learning Methods for Genomic Prediction. Springer Nature. https://link.springer.com/book/10.1007/978-3-030-89010-0.
Rutkoski, Jessica E. 2019. “A Practical Guide to Genetic Gain.” Advances in Agronomy 157: 217–49. https://www.sciencedirect.com/science/article/abs/pii/S0065211319300549.